Complex Concept Acquisition through Directed Search and Feature Caching

Authors

  • Harish Ragavan
  • Larry A. Rendell
  • Michael J. Shaw
  • Antoinette Tessmer

Abstract

Difficult concepts arise in many complex, formative, or poorly understood real-world domains. High interaction among the data attributes causes problems for many learning algorithms, including greedy decision-tree builders, extensions of basic methods, and even backpropagation and MARS. A new algorithm, LFC, uses directed lookahead search to address feature interaction, improving hypothesis accuracy at reasonable cost. LFC also addresses a second problem, the general verbosity or global replication problem. The algorithm caches search information as new features for decision tree construction. The combination of these two design factors leads to improved prediction accuracy, concept compactness, and noise tolerance. Empirical results with synthetic boolean concepts, bankruptcy prediction, and bond rating show typical accuracy improvements of 15%-20% with LFC over several alternative algorithms in cases of moderate feature interaction. LFC also explicates latent relationships in the training data to provide useful intermediate concepts from the perspective of domain experts.

1 Introduction

We describe a learning method for complex concepts and practical results in the difficult financial domain of risk classification. Financial data are collected in the form of observations and an overall assessment (often binary: to lend or not to lend). This constitutes an attribute-value representation, where each training example is described by an n-tuple of attribute values and a class value. A concept is an intensional description of the class; a learning algorithm constructs a hypothesis of the concept. Learning that assumes class membership varies little as hypothesis descriptions are modified slightly has been called similarity-based learning (SBL). Shaw and Gentry [1990] have shown advantages of SBL over traditional financial classification methods such as discriminant analysis.
Some statistical approaches allow nonlinearity, but SBL is more readily comprehensible and facilitates human-computer interaction. While many concepts pose no problems for typical decision tree induction algorithms, complex domains often produce inaccurate trees. As shown later, C4.5 gives low predictive accuracies for financial concepts. Other methods such as improved SBL and backpropagation also fare poorly. Financial risk classification is difficult because of interaction among the attributes. In a critique of current learning systems, Rendell and Ragavan [1993, this volume] examine relationships among attribute interaction, real-world problems, and system requirements. This paper explores a new algorithm called LFC (lookahead feature construction), which applies directed lookahead search to build decision trees and form new features. LFC uses several techniques to decide dynamically which paths to attempt, how far to pursue the search, and which new features provide the most concise …
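
To make the attribute-interaction problem concrete, the following is a minimal sketch, not the authors' LFC implementation. It assumes information gain as the evaluation measure and a fixed lookahead of two literals; the names `entropy`, `info_gain`, `lookahead_features`, and `cache` are hypothetical. On a synthetic boolean concept (class = x0 XOR x1, with x2 irrelevant), every single attribute has zero gain, so a greedy tree builder has nothing to choose from, whereas looking ahead over conjunctions of two literals finds an informative constructed feature that can be cached as a new attribute.

```python
from itertools import combinations, product
from math import log2

def entropy(labels):
    """Binary entropy (in bits) of a list of 0/1 class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n
    if p in (0.0, 1.0):
        return 0.0
    return -(p * log2(p) + (1 - p) * log2(1 - p))

def info_gain(column, labels):
    """Information gain of splitting on a 0/1 feature column."""
    n = len(labels)
    gain = entropy(labels)
    for v in (0, 1):
        subset = [y for x, y in zip(column, labels) if x == v]
        gain -= len(subset) / n * entropy(subset)
    return gain

# Synthetic boolean concept (illustrative data, not from the paper):
# class = x0 XOR x1; x2 is irrelevant.
rows = list(product((0, 1), repeat=3))
labels = [r[0] ^ r[1] for r in rows]
columns = {f"x{i}": [r[i] for r in rows] for i in range(3)}

# Greedy, one-attribute-at-a-time evaluation: every gain is zero.
single_gains = {name: info_gain(col, labels) for name, col in columns.items()}

def lookahead_features(columns):
    """Yield candidate features: conjunctions of two (possibly negated) attributes."""
    for (a, ca), (b, cb) in combinations(columns.items(), 2):
        for na, nb in product((0, 1), repeat=2):
            name = f"({'not ' if na else ''}{a} and {'not ' if nb else ''}{b})"
            yield name, [(x ^ na) & (y ^ nb) for x, y in zip(ca, cb)]

# Lookahead: pick the best two-literal conjunction and cache it as a new feature.
best_name, best_col = max(
    lookahead_features(columns), key=lambda f: info_gain(f[1], labels)
)
best_gain = info_gain(best_col, labels)
cache = {best_name: best_col}

print(single_gains)
print(best_name, round(best_gain, 3))
```

The cached feature can then be offered to the tree builder alongside the original attributes. A fixed two-literal lookahead is a simplification chosen for brevity; the abstract describes search whose direction and depth are decided dynamically.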

Similar Resources

A Novel Caching Strategy in Video-on-Demand (VoD) Peer-to-Peer (P2P) Networks Based on Complex Network Theory

The popularity of video-on-demand (VoD) streaming has grown dramatically over the World Wide Web. Most users in VoD P2P networks have to wait a long time to access the videos they request. Therefore, reducing the waiting time to access videos is the main challenge for VoD P2P networks. In this paper, we propose a novel algorithm for caching video based on peers' priority and video's popula...

The Interlanguage of Persian Learners of Italian: a Focus on Complex Predicates

This paper aims at investigating the acquisition of Italian complex predicates by native speakers of Persian. Complex predication is not as pervasive a phenomenon in Italian as it is in Persian. Yet Italian native speakers use complex predicates productively; spontaneous data show that Persian learners of Italian seem to be perfectly aware of Italian complex predicates and use this familiar fea...

Methodology of conceptual review in the health system

Background: Conceptual review is a creative research method for generating new knowledge in the context of a vague and complex concept that helps to explain and clarify the concept, its components and its relation to related concepts. This study aimed to explain the methodology of conceptual review in the health system. Methods: Articles related to the conceptual research method were searched ...

The Acquisition and Development of the Concepts of English Modality Through Metaphors

The current study attempted to investigate the effect of Systemic-Theoretical Instruction (STI) on English-language learners’ acquisition of the concepts of modal verbs through exploiting metaphors. To this end, the effect of the main treatment of the study, i.e. Concept-Based Instruction (CBI) was investigated through conceptual metaphors, in two gender (male vs. female) ...

Publication date: 1993